Abstract

This paper investigates weak convergence of U-statistics via approximation in
probability. The classical condition that the second moment of the kernel of
the underlying U-statistic exists is relaxed to having 4/3 moments only (modulo
a logarithmic term). Furthermore, the conditional expectation of the kernel is
only assumed to be in the domain of attraction of the normal law (instead of
the classical two-moment condition).



1 Introduction

Employing truncation arguments and the concept of weak convergence of
self-normalized and studentized partial sums, which were inspired by the works
of Csörgő, Szyszkowicz and Wang in [5], [4], [2] and [3], we derive weak
convergence results via approximations in probability for pseudo-self-normalized
U-statistics and U-statistic type processes. Our results require only that (i)
the expected value of the product of the kernel of the underlying U-statistic
to the exponent 4/3 and its logarithm exists (instead of having 2 moments of
the kernel), and that (ii) the conditional expected value of the kernel on each
observation is in the domain of attraction of the normal law (instead of having
2 moments). Similarly relaxed moment conditions were first used by Csörgő,
Szyszkowicz and Wang [5] for U-statistics type processes for changepoint
problems in terms of kernels of order 2 (cf. Remark 5). Our results in this
exposition extend their work to approximating U-statistics with higher order
kernels. The weak convergence results thus obtained for U-statistics in turn
extend those obtained by R.G. Miller Jr. and P.K. Sen in [9] in 1972 (cf.
Remark 3). The latter results of Miller and Sen are based on the classical
condition of the existence of the second moment of the kernel of the underlying
U-statistic, which in turn implies the existence of the second moment of the
conditional expected value of the kernel on each of the observations.
2 Main results and Background



Let $X_1, X_2, \ldots$ be a sequence of non-degenerate real-valued i.i.d. random
variables with distribution $F$. Let $h(X_1, \ldots, X_m)$, symmetric in its
arguments, be a Borel-measurable real-valued kernel of order $m \ge 1$, and
consider the parameter
\[
\theta = \int \cdots \int h(x_1, \ldots, x_m)\, dF(x_1) \cdots dF(x_m) < \infty.
\]
The corresponding U-statistic (cf. Serfling [10] or Hoeffding [8]) is
\[
U_n = \binom{n}{m}^{-1} \sum_{C(n,m)} h(X_{i_1}, \ldots, X_{i_m}),
\]
where $m \le n$ and $\sum_{C(n,m)}$ denotes the sum over
$C(n,m) = \{1 \le i_1 < \cdots < i_m \le n\}$.
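As an illustration of the definition of the U-statistic, the following minimal Python sketch averages a symmetric kernel over all index subsets of size m. The function name `u_statistic` and its arguments are hypothetical names chosen for this illustration; brute-force enumeration is only practical for small n and m.

```python
from itertools import combinations
from math import comb


def u_statistic(xs, kernel, m):
    """Average of kernel(x_{i_1}, ..., x_{i_m}) over all 1 <= i_1 < ... < i_m <= n.

    xs     : sequence of observations (length n)
    kernel : symmetric function of m arguments (assumed symmetric, not checked)
    m      : order of the kernel, 1 <= m <= n
    """
    n = len(xs)
    if not 1 <= m <= n:
        raise ValueError("the kernel order m must satisfy 1 <= m <= n")
    # combinations() enumerates exactly the index set C(n, m).
    total = sum(kernel(*subset) for subset in combinations(xs, m))
    return total / comb(n, m)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    # The kernel (x - y)^2 / 2 of order 2 yields the unbiased sample variance.
    print(u_statistic(data, lambda x, y: (x - y) ** 2 / 2.0, 2))
```

With the kernel $(x - y)^2/2$ of order 2, the U-statistic coincides with the unbiased sample variance, which gives a quick sanity check of the enumeration.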

In order to state our results, we first need the following definition.

Definition. A sequence $X, X_1, X_2, \ldots$ of i.i.d. random variables is said
to be in the domain of attraction of the normal law ($X \in \mathrm{DAN}$) if
there exist sequences of constants $A_n$ and $B_n > 0$ such that, as
$n \to \infty$,
\[
\frac{\sum_{i=1}^{n} X_i - A_n}{B_n} \to_d N(0,1).
\]

Remark 1. Further to this definition of DAN, it is known that $A_n$ can be
taken as $nE(X)$ and $B_n = n^{1/2}\ell_X(n)$, where $\ell_X(n)$ is a slowly
varying function at infinity (i.e., $\lim_{n\to\infty} \ell_X(nk)/\ell_X(n) = 1$
for any $k > 0$), defined by the distribution of $X$. Moreover,
$\ell_X(n) = \sqrt{Var(X)} > 0$ if $Var(X) < \infty$, and
$\ell_X(n) \to \infty$, as $n \to \infty$, if $Var(X) = \infty$. Also $X$ has
all moments less than 2, and the variance of $X$ is positive, but need not be
finite.
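To make the notion of slow variation in Remark 1 concrete, the following Python sketch evaluates the ratio $\ell(nk)/\ell(n)$ for the function $\ell(n) = \sqrt{\log n}$. This particular $\ell$ is an assumption made purely for illustration and is not taken from the paper; the names `ell` and `variation_ratio` are likewise hypothetical.

```python
import math


def ell(n):
    """Illustrative slowly varying function, ell(n) = sqrt(log n), for n > 1.

    This choice is an assumption for the sketch, not a quantity from the text.
    """
    return math.sqrt(math.log(n))


def variation_ratio(n, k):
    """Ratio ell(n * k) / ell(n), which tends to 1 as n grows for any fixed k > 0."""
    return ell(n * k) / ell(n)


if __name__ == "__main__":
    for n in (1e3, 1e6, 1e12, 1e100):
        print(n, variation_ratio(n, 2.0))
```

The printed ratios approach 1 as n grows, although the function itself diverges, which is the behaviour described for $\ell_X(n)$ in the infinite-variance case.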

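Also define the pseudo-self-normalized U-process as follows.
\[
U^*_{[nt]} =
\begin{cases}
0, & 0 \le t < \frac{m}{n}, \\[4pt]
\dfrac{U_{[nt]} - \theta}{V_n}, & \frac{m}{n} \le t \le 1,
\end{cases}
\]
where $[\cdot]$ denotes the greatest integer function,
$V_n^2 := \sum_{i=1}^{n} h_1^2(X_i)$ and
$h_1(x) = E(h(X_1, \ldots, X_m) - \theta \mid X_1 = x)$.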

Theorem 1. If

(a) $E\big(|h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)|\big) < \infty$ and
$h_1(X_1) \in \mathrm{DAN}$,

then, as $n \to \infty$, we have

(b) $\frac{[nt_0]}{m}\, U^*_{[nt_0]} \to_d N(0,t_0)$, for $t_0 \in (0,1]$;

(c) $\frac{[nt]}{m}\, U^*_{[nt]} \to_d W(t)$ on $(D[0,1],\rho)$, where $\rho$ is
the sup-norm for functions in $D[0,1]$ and $\{W(t), 0 \le t \le 1\}$ is a
standard Wiener process;

(d) On an appropriate probability space for $X_1, X_2, \ldots$, we can construct
a standard Wiener process $\{W(t), 0 \le t < \infty\}$ such that
\[
\sup_{0 \le t \le 1} \left| \frac{[nt]}{m}\, U^*_{[nt]} - \frac{W(nt)}{n^{1/2}} \right| = o_P(1).
\]



Remark 2. The statement (c), whose notion will be used throughout, stands
for the following functional central limit theorem (cf. Remark 2.1 in Csörgő,
Szyszkowicz and Wang [3]). On account of (d), as $n \to \infty$, we have
\[
g\big(S_{[n\cdot]}/V_n\big) \to_d g\big(W(\cdot)\big)
\]
for all $g : D = D[0,1] \to \mathbb{R}$ that are $(D, \mathcal{D})$ measurable
and $\rho$-continuous, or $\rho$-continuous except at points forming a set of
Wiener measure zero on $(D, \mathcal{D})$, where $\mathcal{D}$ denotes the
$\sigma$-field of subsets of $D$ generated by the finite-dimensional subsets of
$D$.



Theorem 1 is fashioned after the work on weak convergence of self-normalized
partial sums processes of Csörgő, Szyszkowicz and Wang in [2], [3] and [4], which
constitute extensions of the contribution of Giné, Götze and Mason in [6].



As to $h_1(X_1) \in \mathrm{DAN}$, since $E h_1(X_1) = 0$ and
$h_1(X_1), h_1(X_2), \ldots$ are i.i.d. random variables, Theorem 1 of [2] (cf.
also Theorem 2.3 of [3]) in this context reads as follows.

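Lemma 1. As $n \to \infty$, the following statements are equivalent:

(a) $h_1(X_1) \in \mathrm{DAN}$;

(b) $\dfrac{\sum_{i=1}^{[nt_0]} h_1(X_i)}{V_n} \to_d N(0,t_0)$ for $t_0 \in (0,1]$;

(c) $\dfrac{\sum_{i=1}^{[nt]} h_1(X_i)}{V_n} \to_d W(t)$ on $(D[0,1],\rho)$, where
$\rho$ is the sup-norm metric for functions in $D[0,1]$ and
$\{W(t), 0 \le t \le 1\}$ is a standard Wiener process;

(d) On an appropriate probability space for $X_1, X_2, \ldots$, we can construct
a standard Wiener process $\{W(t), 0 \le t < \infty\}$ such that
\[
\sup_{0 \le t \le 1} \left| \frac{\sum_{i=1}^{[nt]} h_1(X_i)}{V_n} - \frac{W(nt)}{n^{1/2}} \right| = o_P(1).
\]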
Also, in the same vein, Proposition 2.1 of [3] for $h_1(X_1) \in \mathrm{DAN}$
reads as follows.

Lemma 2. As $n \to \infty$, the following statements are equivalent:

(a) $h_1(X_1) \in \mathrm{DAN}$;

There is a sequence of constants $B_n \nearrow \infty$, such that

(b) $\dfrac{\sum_{i=1}^{[nt_0]} h_1(X_i)}{B_n} \to_d N(0,t_0)$ for $t_0 \in (0,1]$;

(c) $\dfrac{\sum_{i=1}^{[nt]} h_1(X_i)}{B_n} \to_d W(t)$ on $(D[0,1],\rho)$, where
$\rho$ is the sup-norm metric for functions in $D[0,1]$ and
$\{W(t), 0 \le t \le 1\}$ is a standard Wiener process;

(d) On an appropriate probability space for $X_1, X_2, \ldots$, we can construct
a standard Wiener process $\{W(t), 0 \le t < \infty\}$ such that
\[
\sup_{0 \le t \le 1} \left| \frac{\sum_{i=1}^{[nt]} h_1(X_i)}{B_n} - \frac{W(nt)}{n^{1/2}} \right| = o_P(1).
\]



In view of Lemma 2, a scalar normalized companion of Theorem 1 reads as 
follows. 



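Theorem 2. If

(a) $E\big(|h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)|\big) < \infty$ and
$h_1(X_1) \in \mathrm{DAN}$,

then, as $n \to \infty$, we have

(b) $\dfrac{[nt_0]}{m}\, \dfrac{U_{[nt_0]} - \theta}{B_n} \to_d N(0,t_0)$, where
$t_0 \in (0,1]$;

(c) $\dfrac{[nt]}{m}\, \dfrac{U_{[nt]} - \theta}{B_n} \to_d W(t)$ on
$(D[0,1],\rho)$, where $\rho$ is the sup-norm for functions in $D[0,1]$ and
$\{W(t), 0 \le t \le 1\}$ is a standard Wiener process;

(d) On an appropriate probability space for $X_1, X_2, \ldots$, we can construct
a standard Wiener process $\{W(t), 0 \le t < \infty\}$ such that
\[
\sup_{0 \le t \le 1} \left| \frac{[nt]}{m}\, \frac{U_{[nt]} - \theta}{B_n} - \frac{W(nt)}{n^{1/2}} \right| = o_P(1).
\]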
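By defining
\[
Y_n^*(t) = 0 \quad \text{for } 0 \le t \le \frac{m-1}{n},
\qquad
Y_n^*\Big(\frac{k}{n}\Big) = \frac{k\,(U_k - \theta)}{m \sqrt{n\, Var(h_1(X_1))}}, \quad k = m, \ldots, n,
\]
and for $t \in \big[\frac{k-1}{n}, \frac{k}{n}\big]$, $k = m, \ldots, n$,
\[
Y_n^*(t) = Y_n^*\Big(\frac{k-1}{n}\Big) + n\Big(t - \frac{k-1}{n}\Big)\Big(Y_n^*\Big(\frac{k}{n}\Big) - Y_n^*\Big(\frac{k-1}{n}\Big)\Big),
\]
we can state the already mentioned 1972 weak convergence result of Miller and
Sen as follows.

Theorem A. If

(I) $0 < E\big[(h(X_1, X_2, \ldots, X_m) - \theta)(h(X_1, X_{m+1}, \ldots, X_{2m-1}) - \theta)\big] = Var(h_1(X_1)) < \infty$
and

(II) $E h^2(X_1, \ldots, X_m) < \infty$,

then, as $n \to \infty$,
\[
Y_n^*(t) \to_d W(t) \quad \text{on } (C[0,1],\rho),
\]
where $\rho$ is the sup-norm for functions in $C[0,1]$ and
$\{W(t), 0 \le t \le 1\}$ is a standard Wiener process.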
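The polygonal process used in the result of Miller and Sen is a linear interpolation between its values at the grid points k/n. The following Python sketch evaluates such a path at an arbitrary t, assuming the grid values $Y_n^*(k/n)$, k = m, ..., n, have already been computed and that the path vanishes up to $(m-1)/n$. The function name `polygonal_process` and its argument layout are hypothetical choices made for this illustration.

```python
import math


def polygonal_process(grid_values, m, n, t):
    """Evaluate the piecewise-linear path at t in [0, 1].

    grid_values[k - m] is the value of the path at k/n for k = m, ..., n.
    The path is 0 on [0, (m - 1)/n] and linear on each [(k - 1)/n, k/n].
    """
    if len(grid_values) != n - m + 1:
        raise ValueError("grid_values must hold one value for each k = m, ..., n")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")

    def knot(k):
        # Value at k/n; zero before the first admissible index m.
        return 0.0 if k < m else float(grid_values[k - m])

    if t <= (m - 1) / n:
        return 0.0
    # Index k with (k - 1)/n <= t <= k/n, clipped to the range m, ..., n.
    k = min(n, max(m, math.ceil(t * n)))
    left = knot(k - 1)
    return left + n * (t - (k - 1) / n) * (knot(k) - left)


if __name__ == "__main__":
    values = [1.0, 3.0, 2.0]  # values at 2/4, 3/4 and 4/4 for m = 2, n = 4
    for t in (0.1, 0.375, 0.5, 0.875, 1.0):
        print(t, polygonal_process(values, 2, 4, t))
```

Because the path is continuous by construction, it is an element of C[0, 1], which is the space on which Theorem A is stated.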



Remark 3. When $E h^2(X_1, \ldots, X_m) < \infty$, first note that existence of
the second moment of the kernel $h(X_1, \ldots, X_m)$ implies the existence of
the second moment of $h_1(X_1)$. Therefore, according to Remark 1,
$B_n = \sqrt{n\, E h_1^2(X_1)}$. This means that under the conditions of
Theorem A, Theorem 2 holds true and, via (c) of the latter, it yields a version
of Theorem A on $D[0,1]$. We note in passing that our method of proof differs
from that of the cited paper of Miller and Sen. We use a method of truncation
à la [5] to relax the condition $E h^2(X_1, \ldots, X_m) < \infty$ to the less
stringent moment condition
$E\big(|h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)|\big) < \infty$ that, in
turn, enables us to have $h_1(X_1) \in \mathrm{DAN}$ in general, with the
possibility of infinite variance.

Remark 4. Theorem 1 of [2] (Theorem 2.3 in [3]), as well as Proposition
2.1 of [3], continue to hold true in terms of Donskerized partial sums that are
elements of $C[0,1]$. Consequently, the same is true for the above stated
Lemmas 1 and 2, concerning $h_1(X_1) \in \mathrm{DAN}$. This in turn, mutatis
mutandis, renders appropriate versions of Theorems 1 and 2 to hold true in
$(C[0,1],\rho)$.
Proof of Theorems 1 and 2.

In view of Lemmas 1 and 2, in order to prove Theorems 1 and 2, we only have 
to prove the following theorem. 



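Theorem 3. If
$E\big(|h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)|\big) < \infty$ and
$h_1(X_1) \in \mathrm{DAN}$ then, as $n \to \infty$, we have
\[
\sup_{0 \le t \le 1} \left| \frac{[nt]}{m}\, U^*_{[nt]} - \frac{\sum_{i=1}^{[nt]} h_1(X_i)}{V_n} \right| = o_P(1), \tag{1}
\]
and
\[
\sup_{0 \le t \le 1} \left| \frac{[nt]}{m}\, \frac{U_{[nt]} - \theta}{B_n} - \frac{\sum_{i=1}^{[nt]} h_1(X_i)}{B_n} \right| = o_P(1). \tag{2}
\]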



Proof of Theorem 3. In view of (b) of Lemma 2 with $t_0 = 1$, Corollary
2.1 of [3] yields $V_n^2 / B_n^2 \to_P 1$. This in turn implies the equivalency
of (1) and (2). Therefore, it suffices to prove (2) only.
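It can be easily seen that
\[
\sup_{0 \le t \le 1} \left| \frac{[nt]}{m}\, \frac{U_{[nt]} - \theta}{B_n} - \frac{\sum_{i=1}^{[nt]} h_1(X_i)}{B_n} \right|
\le \sup_{0 \le t < \frac{m}{n}} \left| \frac{\sum_{i=1}^{[nt]} h_1(X_i)}{B_n} \right|
+ \sup_{\frac{m}{n} \le t \le 1} \left| \frac{[nt]}{m}\, \frac{U_{[nt]} - \theta}{B_n} - \frac{\sum_{i=1}^{[nt]} h_1(X_i)}{B_n} \right|.
\]
Since, as $n \to \infty$, we have $\frac{m}{n} \to 0$ and, consequently, in view
of (d) of Lemma 2,
\[
\sup_{0 \le t < \frac{m}{n}} \left| \frac{\sum_{i=1}^{[nt]} h_1(X_i)}{B_n} \right| = o_P(1),
\]
in order to prove (2), it will be enough to show that
\[
\sup_{\frac{m}{n} \le t \le 1} \left| \frac{[nt]}{m}\, \frac{U_{[nt]} - \theta}{B_n} - \frac{\sum_{i=1}^{[nt]} h_1(X_i)}{B_n} \right| = o_P(1),
\]
or equivalently to show that
\begin{align*}
& \max_{m \le k \le n} \left| \frac{k}{m B_n} \binom{k}{m}^{-1} \sum_{C(k,m)} \big(h(X_{i_1}, \ldots, X_{i_m}) - \theta\big) - \frac{1}{B_n} \sum_{i=1}^{k} h_1(X_i) \right| \\
&\quad = \max_{m \le k \le n} \frac{k}{m B_n} \left| \binom{k}{m}^{-1} \sum_{C(k,m)} \big[h(X_{i_1}, \ldots, X_{i_m}) - \theta - h_1(X_{i_1}) - \cdots - h_1(X_{i_m})\big] \right| \\
&\quad = o_P(1). \tag{3}
\end{align*}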
The first equation of (3) follows from the fact that
\[
\sum_{C(k,m)} \big(h_1(X_{i_1}) + \cdots + h_1(X_{i_m})\big) = \frac{m}{k} \binom{k}{m} \sum_{i=1}^{k} h_1(X_i),
\]
where $\sum_{C(k,m)}$ denotes the sum over
$C(k,m) = \{1 \le i_1 < \cdots < i_m \le k\}$. To establish (3), without loss of
generality we can, and shall, assume that $\theta = 0$.
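The counting identity behind the first equation of (3) holds because each index i belongs to exactly $\binom{k-1}{m-1} = \frac{m}{k}\binom{k}{m}$ subsets of size m. The following Python sketch checks it numerically for arbitrary numbers standing in for the values $h_1(X_i)$; the function names `lhs` and `rhs` are hypothetical and used only for this check.

```python
from itertools import combinations
from math import comb, isclose


def lhs(values, m):
    """Sum over all m-subsets of the sum of the subset's entries."""
    return sum(sum(subset) for subset in combinations(values, m))


def rhs(values, m):
    """(m / k) * C(k, m) * (sum of all entries), where k = len(values)."""
    k = len(values)
    return (m / k) * comb(k, m) * sum(values)


if __name__ == "__main__":
    sample = [0.5, -1.25, 2.0, 3.75, -0.1]  # arbitrary stand-ins for h_1(X_i)
    for m in range(1, len(sample) + 1):
        print(m, lhs(sample, m), rhs(sample, m), isclose(lhs(sample, m), rhs(sample, m)))
```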

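Considering that for large $n$, $\frac{1}{B_n} \le \frac{1}{\sqrt{n}}$ (cf.
Remark 1), to conclude (3), it will be enough to show that, as $n \to \infty$,
the following holds:
\[
n^{-1/2} \max_{m \le k \le n} k \left| \binom{k}{m}^{-1} \sum_{C(k,m)} \big(h(X_{i_1}, \ldots, X_{i_m}) - h_1(X_{i_1}) - \cdots - h_1(X_{i_m})\big) \right| = o_P(1). \tag{4}
\]
To establish (4), for the ease of notation, write $h = h(X_{i_1}, \ldots, X_{i_m})$
inside the indicators and let
\begin{align*}
h^{(1)}(X_{i_1}, \ldots, X_{i_m}) &:= h(X_{i_1}, \ldots, X_{i_m}) I_{(|h| \le n^{3/2})} - E\big(h(X_{i_1}, \ldots, X_{i_m}) I_{(|h| \le n^{3/2})}\big), \\
\tilde{h}^{(1)}(X_{i_j}) &:= E\big(h^{(1)}(X_{i_1}, \ldots, X_{i_m}) \mid X_{i_j}\big), \quad j = 1, \ldots, m, \\
\psi^{(1)}(X_{i_1}, \ldots, X_{i_m}) &:= h^{(1)}(X_{i_1}, \ldots, X_{i_m}) - \tilde{h}^{(1)}(X_{i_1}) - \cdots - \tilde{h}^{(1)}(X_{i_m}), \\
h^{(2)}(X_{i_1}, \ldots, X_{i_m}) &:= h(X_{i_1}, \ldots, X_{i_m}) I_{(|h| > n^{3/2})} - E\big(h(X_{i_1}, \ldots, X_{i_m}) I_{(|h| > n^{3/2})}\big), \\
\tilde{h}^{(2)}(X_{i_j}) &:= E\big(h^{(2)}(X_{i_1}, \ldots, X_{i_m}) \mid X_{i_j}\big), \quad j = 1, \ldots, m,
\end{align*}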

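where $I_A$ is the indicator function of the set $A$. Now observe that
\begin{align*}
& n^{-1/2} \max_{m \le k \le n} k \left| \binom{k}{m}^{-1} \sum_{C(k,m)} \big(h(X_{i_1}, \ldots, X_{i_m}) - h_1(X_{i_1}) - \cdots - h_1(X_{i_m})\big) \right| \\
&\quad \le n^{-1/2} \max_{m \le k \le n} k \left| \binom{k}{m}^{-1} \sum_{C(k,m)} \big(h(X_{i_1}, \ldots, X_{i_m}) - h^{(1)}(X_{i_1}, \ldots, X_{i_m})\big) \right| \\
&\qquad + n^{-1/2} \max_{m \le k \le n} k \left| \binom{k}{m}^{-1} \sum_{C(k,m)} \big(h_1(X_{i_1}) + \cdots + h_1(X_{i_m}) - \tilde{h}^{(1)}(X_{i_1}) - \cdots - \tilde{h}^{(1)}(X_{i_m})\big) \right| \\
&\qquad + n^{-1/2} \max_{m \le k \le n} k \left| \binom{k}{m}^{-1} \sum_{C(k,m)} \psi^{(1)}(X_{i_1}, \ldots, X_{i_m}) \right| \\
&\quad := J_1(n) + J_2(n) + J_3(n).
\end{align*}
We will show that $J_s(n) = o_P(1)$, $s = 1, 2, 3$.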
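To deal with the term $J_1(n)$, first note that
\[
h(X_{i_1}, \ldots, X_{i_m}) - h^{(1)}(X_{i_1}, \ldots, X_{i_m}) = h^{(2)}(X_{i_1}, \ldots, X_{i_m}).
\]
Therefore, in view of Theorem 2.3.3 of [1] page 43, for $\varepsilon > 0$, we can
write
\begin{align*}
& P\left( n^{-1/2} \max_{m \le k \le n} k \left| \binom{k}{m}^{-1} \sum_{C(k,m)} h^{(2)}(X_{i_1}, \ldots, X_{i_m}) \right| > \varepsilon \right) \\
&\quad \le \varepsilon^{-1} n^{-1/2} \big( m\, E|h^{(2)}(X_1, \ldots, X_m)| + n\, E|h^{(2)}(X_1, \ldots, X_m)| \big) \\
&\quad \le \varepsilon^{-1} n^{-1/2}\, 2m\, E|h(X_1, \ldots, X_m)| + \varepsilon^{-1} n^{1/2}\, 2m\, E\big(|h(X_1, \ldots, X_m)| I_{(|h| > n^{3/2})}\big) \\
&\quad \le \varepsilon^{-1} n^{-1/2}\, 2m\, E|h(X_1, \ldots, X_m)| + \varepsilon^{-1}\, 2m\, E\big(|h(X_1, \ldots, X_m)|^{4/3} I_{(|h| > n^{3/2})}\big) \\
&\quad \to 0, \quad \text{as } n \to \infty.
\end{align*}
Here we have used the fact that $E|h(X_1, \ldots, X_m)|^{4/3} < \infty$. The
last line above implies that $J_1(n) = o_P(1)$.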
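The passage from the first moment to the 4/3 moment in the bound for $J_1(n)$ rests on a pointwise inequality. Assuming the truncation level is $n^{3/2}$ and the moment exponent is 4/3, as read from the surrounding text, on the event $\{|h| > n^{3/2}\}$ one has $|h|^{1/3} > n^{1/2}$ and hence $n^{1/2}|h| \le |h|^{4/3}$. The Python sketch below checks this numerically; the function name `tail_ratio` is hypothetical.

```python
def tail_ratio(h_abs, n):
    """Return |h|^{4/3} / (n^{1/2} |h|) = |h|^{1/3} / n^{1/2}.

    Under the assumed truncation |h| > n^{3/2} this ratio exceeds 1, which is
    the pointwise form of n^{1/2} |h| <= |h|^{4/3} on the truncation event.
    """
    if n <= 0 or h_abs <= n ** 1.5:
        raise ValueError("requires n > 0 and |h| > n^{3/2}")
    return h_abs ** (1.0 / 3.0) / n ** 0.5


if __name__ == "__main__":
    for n in (1, 10, 1000):
        for factor in (1.01, 2.0, 100.0):
            print(n, factor, tail_ratio(factor * n ** 1.5, n))
```

Taking expectations of the pointwise inequality over the truncation event gives the displayed step, and the resulting tail expectation vanishes as n grows because the 4/3 moment is finite.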

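Next, to deal with $J_2(n)$, first observe that
\[
h_1(X_{i_1}) + \cdots + h_1(X_{i_m}) - \tilde{h}^{(1)}(X_{i_1}) - \cdots - \tilde{h}^{(1)}(X_{i_m}) = \sum_{j=1}^{m} \tilde{h}^{(2)}(X_{i_j}).
\]
It can be easily seen that $\sum_{j=1}^{m} \tilde{h}^{(2)}(X_{i_j})$ is symmetric
in $X_{i_1}, \ldots, X_{i_m}$. Thus, in view of Theorem 2.3.3 of [1] page 43, for
$\varepsilon > 0$, we have
\begin{align*}
& P\left( n^{-1/2} \max_{m \le k \le n} k \left| \binom{k}{m}^{-1} \sum_{C(k,m)} \sum_{j=1}^{m} \tilde{h}^{(2)}(X_{i_j}) \right| > \varepsilon \right) \\
&\quad \le \varepsilon^{-1} n^{-1/2}\, 2m\, E|h(X_1, \ldots, X_m)| + \varepsilon^{-1} n^{1/2}\, 2m\, E\big(|h(X_1, \ldots, X_m)| I_{(|h| > n^{3/2})}\big) \\
&\quad \to 0, \quad \text{as } n \to \infty,
\end{align*}
i.e., $J_2(n) = o_P(1)$.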



Note. Alternatively, one can use Etemadi's maximal inequality for partial
sums of i.i.d. random variables, followed by Markov's inequality, to show
$J_2(n) = o_P(1)$.
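As for the term $J_3(n)$, first note that
$\binom{k}{m}^{-1} \sum_{C(k,m)} \psi^{(1)}(X_{i_1}, \ldots, X_{i_m})$ is a
U-statistic. Consequently, one more application of Theorem 2.3.3 page 43 of
[1] yields
\begin{align*}
& P\left( n^{-1/2} \max_{m \le k \le n} k \left| \binom{k}{m}^{-1} \sum_{C(k,m)} \psi^{(1)}(X_{i_1}, \ldots, X_{i_m}) \right| > \varepsilon \right) \\
&\quad \le \varepsilon^{-2} n^{-1} \left\{ m^2\, E\big(\psi^{(1)}(X_1, \ldots, X_m)\big)^2 + \sum_{k=m+1}^{n} (2k+1)\, E\left( \binom{k}{m}^{-1} \sum_{C(k,m)} \psi^{(1)}(X_{i_1}, \ldots, X_{i_m}) \right)^2 \right\}. \tag{5}
\end{align*}
Observing that
$E\big(\psi^{(1)}(X_1, \ldots, X_m)\big)^2 \le C(m)\, E\big(h^2(X_1, \ldots, X_m) I_{(|h| \le n^{3/2})}\big)$,
where $C(m)$ is a positive constant that does not depend on $n$, that
\[
E\psi^{(1)}(X_{i_1}, \ldots, X_{i_m}) = E\big(\psi^{(1)}(X_{i_1}, \ldots, X_{i_m}) \mid X_{i_j}\big) = 0, \quad j = 1, \ldots, m,
\]
and in view of Lemma B page 184 of [10], it follows that for some positive
constants $C_1(m)$ and $C_2(m)$ which do not depend on $n$, the R.H.S. of (5) is
bounded above by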

$\varepsilon^{-2} n^{-1}\, E\big(h^2(X_1, \ldots, X_m) I_{(|h| \le n^{3/2})}\big)\, \big(C_1(m) + C_2(m) \log(n)\big)$

< e~^Ci{m)n^E\h{Xi,...,Xm)\-^ 
+e~2 Ci(m 



+€-2^2 (m 
+e-2 C2{m 

+e'2 Ci{m 

+e-2 C72(m 
— > 0, as n — > oo. 



5 -^m 



n- log(n)E|/i(Xi,...,X„)|3 
E(|/i(Xi,...,X„)|tlog|/i(Xi 

E|/i(Xi,...,X^)|t 

log(n)E|/i(Xi,...,X^)|t 
E (\h{Xu . . . log . . . /(|;,|>„) 



Thus $J_3(n) = o_P(1)$. This also completes the proof of (4), and hence also
that of Theorem 3. Now, as already noted above, the proofs of Theorems 1 and 2
follow from Theorem 3 and Lemmas 1 and 2.

Remark 5. Studying a U-statistics type process that can be written as a sum of
three U-statistics of order $m = 2$, Csörgő, Szyszkowicz and Wang in [5] proved
that under the slightly more relaxed condition that
$E|h(X_1, \ldots, X_m)|^{4/3} < \infty$, as $n \to \infty$, we have
\[
n^{-3/2} \max_{1 \le k \le n} \left| \sum_{1 \le i < j \le k} \big(h(X_i, X_j) - h_1(X_i) - h_1(X_j)\big) \right| = o_P(1).
\]
In the proof of the latter, the well known Doob maximal inequality for
martingales was used, which gives us a sharper bound. The just mentioned
inequality is not applicable for the processes in Theorems 1 and 2, even for
U-statistics of order 2. The reason for this is that the inside parts of the
absolute values of $J_s(n)$, $s = 1, 2, 3$, are not martingales. Also, since
$\sum_{C(k,m)} \big(h(X_{i_1}, \ldots, X_{i_m}) - h_1(X_{i_1}) - \cdots - h_1(X_{i_m})\big)$,
for $m > 2$, no longer form a martingale, it seems that the Doob maximal
inequality is not applicable for the process
\[
n^{-m + 1/2} \max_{1 \le k \le n} \left| \sum_{C(k,m)} \big(h(X_{i_1}, \ldots, X_{i_m}) - h_1(X_{i_1}) - \cdots - h_1(X_{i_m})\big) \right|,
\]
which is an extension of the U-statistics parts of the process used by Csörgő,
Szyszkowicz and Wang in [5] for $m = 2$.



Due to the nonexistence of the second moment of the kernel of the underlying
U-statistic in the following example, the weak convergence result of
Theorem A fails to apply. However, using Theorem 1 for example, one can still
derive weak convergence results for the underlying U-statistic.

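Example. Let $X_1, X_2, \ldots$ be a sequence of i.i.d. random variables with
the density function
\[
f(x) =
\begin{cases}
|x - a|^{-3}, & |x - a| \ge 1,\ a \ne 0, \\
0, & \text{elsewhere}.
\end{cases}
\]
Consider the parameter $\theta = E^m(X_1) = a^m$, where $m \ge 1$ is a positive
integer, and the kernel $h(X_1, \ldots, X_m) = \prod_{i=1}^{m} X_i$. Then with
$m$, $n$ satisfying $n \ge m$, the corresponding U-statistic is
\[
U_n = \binom{n}{m}^{-1} \sum_{C(n,m)} \prod_{j=1}^{m} X_{i_j}.
\]
Simple calculation shows that $h_1(X_1) = X_1 a^{m-1} - a^m$.

It is easy to check that
$E\big(|h(X_1, \ldots, X_m)|^{4/3} \log|h(X_1, \ldots, X_m)|\big) < \infty$ and
that $h_1(X_1) \in \mathrm{DAN}$ (cf. Gut [7], page 439). In order to apply
Theorem 1 for this U-statistic, define
\[
U^*_{[nt]} =
\begin{cases}
0, & 0 \le t < \frac{m}{n}, \\[6pt]
\dfrac{\binom{[nt]}{m}^{-1} \sum_{C([nt],m)} \prod_{j=1}^{m} X_{i_j} - a^m}{\big(\sum_{i=1}^{n} (X_i a^{m-1} - a^m)^2\big)^{1/2}}, & \frac{m}{n} \le t \le 1.
\end{cases}
\]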
n 
Then, based on (c) of Theorem 1, as $n \to \infty$, we have
\[
\frac{[nt]}{m}\, U^*_{[nt]} \to_d W(t) \quad \text{on } (D[0,1],\rho),
\]
where $\rho$ is the sup-norm metric for functions in $D[0,1]$ and
$\{W(t), 0 \le t \le 1\}$ is a standard Wiener process. Taking $t = 1$ gives us
a central limit theorem for the pseudo-self-normalized U-statistic
\[
U^*_n = \frac{\binom{n}{m}^{-1} \sum_{C(n,m)} \prod_{j=1}^{m} X_{i_j} - a^m}{\big(\sum_{i=1}^{n} (X_i a^{m-1} - a^m)^2\big)^{1/2}},
\]
i.e., as $n \to \infty$, we have
\[
\frac{n}{m}\, U^*_n \to_d N(0,1).
\]
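The example can be explored numerically. The following Python sketch draws from a density proportional to $|x - a|^{-3}$ on $|x - a| \ge 1$ by inverse transform, using $P(|X - a| > y) = y^{-2}$ for $y \ge 1$ (this form of the density is the reading assumed here), and computes the pseudo-self-normalized statistic $\frac{n}{m} U^*_n$ for the product kernel. The names `sample_x` and `pseudo_self_normalized`, the seed and the sample sizes are hypothetical choices, and the brute-force enumeration is only meant for small n and m.

```python
import math
import random
from itertools import combinations


def sample_x(a, rng):
    """Draw X with |X - a| = U^{-1/2} and a random sign, so P(|X - a| > y) = y^{-2}."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
    radius = u ** -0.5
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return a + sign * radius


def pseudo_self_normalized(xs, a, m):
    """Return (n / m) * (U_n - a^m) / V_n for the kernel h(x_1..x_m) = x_1 * ... * x_m.

    V_n^2 = sum_i (a^{m-1} x_i - a^m)^2, matching h_1(x) = a^{m-1} x - a^m.
    """
    n = len(xs)
    theta = a ** m
    total = sum(math.prod(subset) for subset in combinations(xs, m))
    u_n = total / math.comb(n, m)
    v_n = math.sqrt(sum((a ** (m - 1) * x - theta) ** 2 for x in xs))
    return (n / m) * (u_n - theta) / v_n


if __name__ == "__main__":
    rng = random.Random(12345)  # fixed seed so the run is reproducible
    a, m, n, replications = 1.0, 2, 60, 200
    stats = [
        pseudo_self_normalized([sample_x(a, rng) for _ in range(n)], a, m)
        for _ in range(replications)
    ]
    mean = sum(stats) / replications
    print("empirical mean of (n/m) U_n^*:", mean)
```

For larger n, the order-2 product kernel admits the shortcut $\sum_{i<j} X_i X_j = \big((\sum_i X_i)^2 - \sum_i X_i^2\big)/2$, which avoids enumerating all pairs. Convergence to the normal limit may be slow in this infinite-variance setting, so small-sample histograms should be read with caution.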